Review for NeurIPS paper: Latent World Models For Intrinsically Motivated Exploration
Summary and Contributions: The paper proposes a novel method to address the problem of exploration in RL. It is a known problem in RL that sparse rewards make random exploration _very_ inefficient. One approach for overcoming such limitations is using intrinsic motivation methods: building an auxiliary reward signal to encourage an agent to seek novel or rare states, for example proportional to inverse visit counts or, as proposed in this paper, to some prediction error. Prediction error as a measure of novelty can be heavily affected by several sources of uncertainty, notably: 1. novelty itself (epistemic) -- this is the signal we are typically after -- but also environment stochasticity (aleatoric), which produces irreducible prediction error. The paper proposes a belief-state formulation that the authors claim is not too sensitive to stochasticity and has the ability to extrapolate the state dynamics, such that the prediction error can be a genuine measure of novelty.
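The two bonus shapes the summary mentions (inverse visit counts and forward-model prediction error) can be sketched as follows. This is a minimal illustration, not the paper's actual method; all function names are hypothetical.

```python
import numpy as np

def prediction_error_bonus(phi_next, phi_pred):
    """Prediction-error bonus: squared distance between the predicted and
    observed next-state embeddings. In a latent space that is robust to
    stochasticity, a large error signals a genuinely novel (epistemic) state
    rather than aleatoric noise."""
    return float(np.sum((np.asarray(phi_next) - np.asarray(phi_pred)) ** 2))

def count_bonus(visits, state, beta=1.0):
    """Count-based alternative: reward proportional to an inverse function
    of the visit count (here 1/sqrt(n), a common choice)."""
    visits[state] = visits.get(state, 0) + 1
    return beta / np.sqrt(visits[state])
```

A state seen for the first time yields the maximal count bonus, which then decays with repeated visits; the prediction-error bonus is zero when the forward model predicts the next embedding exactly.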
All reviewers unanimously agree that this paper should be accepted to NeurIPS. The authors did a great job addressing almost all of the reviewers' concerns, leading to three reviewers increasing their scores after the author response. Reviewers particularly praised the readability of the paper, the fact that the method is clearly defined, and that the authors did a good job of visually demonstrating how it works. However, the reviewers also agree that CPC Action would be an important baseline to compare against, so I strongly encourage the authors to take the suggested improvements seriously and work towards an improved version of the paper. I am confident that the authors can make the requested changes and am recommending acceptance.
Latent World Models For Intrinsically Motivated Exploration
In this work we consider partially observable environments with sparse rewards. We present a self-supervised representation learning method for image-based observations, which arranges embeddings to respect the temporal distance between observations. This representation is empirically robust to stochasticity and suitable for novelty detection via the error of a predictive forward model. We consider episodic and life-long uncertainties to guide exploration. We propose to estimate the missing information about the environment with a world model that operates in the learned latent space.
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.66)
- Information Technology > Data Science > Data Mining > Anomaly Detection (0.66)
- Information Technology > Artificial Intelligence > Machine Learning (0.46)
Intrinsically Motivated Exploration for Automated Discovery of Patterns in Morphogenetic Systems
Reinke, Chris, Etcheverry, Mayalen, Oudeyer, Pierre-Yves
Exploration is a cornerstone both for machine learning algorithms and for science in general to discover novel solutions, phenomena and behaviors. Intrinsically motivated goal exploration processes (IMGEPs) were shown to enable autonomous agents to efficiently explore the diversity of effects they can produce on their environment. With IMGEPs, agents self-define their own experiments by imagining goals, then try to achieve them by leveraging their past discoveries. Progressively they learn which goals are achievable. IMGEPs were shown to enable efficient discovery and learning of diverse repertoires of skills in high-dimensional robots. In this article, we show that the IMGEP framework can also be used in an entirely different application area: automated discovery of self-organized patterns in complex morphogenetic systems. We also introduce a new IMGEP algorithm where goal representations are learned online and incrementally (past approaches used precollected training data with batch learning). For experimentation, we use Lenia, a continuous game-of-life cellular automaton. We study how IMGEPs enable the discovery of a variety of complex self-organized visual patterns. We compare random search and goal exploration methods with hand-defined, pretrained and online-learned goal spaces. The results show that goal exploration methods identify more diverse patterns than random exploration. Moreover, the online-learned goal spaces allow the successful discovery of interesting patterns similar to the ones manually identified by human experts. Our results exemplify the ability of IMGEPs to discover novel structures and patterns in complex systems. We are optimistic that their application will aid the understanding and discovery of new knowledge in various domains of science and engineering.
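The goal-imagining loop described in the abstract (sample a goal, reuse the parameters of the closest past discovery, perturb, record the outcome) can be sketched in a few lines. This is only an illustrative skeleton under simplifying assumptions; `env_rollout`, `encode`, and `sample_goal` are hypothetical stand-ins for the system being explored (e.g. Lenia), the outcome-to-goal-space encoder, and the goal sampler.

```python
import random

def imgep(env_rollout, encode, sample_goal, n_random, n_iters, dim=3):
    """Minimal IMGEP sketch: maintain an archive of (parameters, reached goal)
    pairs; after a random bootstrap phase, imagine a goal, reuse the parameters
    of the nearest past discovery in goal space, and mutate them slightly."""
    archive = []  # list of (theta, goal_reached)
    for i in range(n_iters):
        if i < n_random:
            # bootstrap phase: random parameter exploration
            theta = [random.uniform(-1, 1) for _ in range(dim)]
        else:
            g = sample_goal()
            # nearest past discovery in goal space (squared distance)
            nearest = min(archive,
                          key=lambda e: sum((a - b) ** 2 for a, b in zip(e[1], g)))
            # small Gaussian perturbation of its parameters
            theta = [t + random.gauss(0, 0.1) for t in nearest[0]]
        outcome = env_rollout(theta)              # run the system
        archive.append((theta, encode(outcome)))  # map outcome into goal space
    return archive
```

In the paper's online variant, `encode` itself would be learned incrementally from the outcomes gathered so far, rather than fixed in advance.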